SOME ALTERNATIVES TO BAYES' RULE


Persi  Diaconis 
and 

Sandy  Zabell 

ABSTRACT 

We  review  Bayes’  rule,  Jeffrey's  rule,  and  Dempster's  rule  as  methods 
of  revising  probability  judgments  based  on  new  evidence. 

1. INTRODUCTION

There  are  several  different  approaches  to  what  might  be  called  "the 
mathematics  of  changing  one's  mind."  The  most  frequently  discussed  method, 
Bayes'  rule,  changes  a  prior  or  initial  probability  P  to  a  posterior  or 
final probability P*, based on the occurrence of an event E. It specifies
that for any event A:

BAYES'  RULE  P*(A)  =  P(A  and  E)/P(E)  . 

Bayes'  rule  is  not  (at  least  directly)  applicable  if 

•  New information does not arrive in the form "event E occurred"
(e.g., the murderer was a woman), but instead in the form "the
odds on E have changed" (e.g., the murderer was likely to have
been a woman). This is sometimes called the problem of probable
knowledge.

•  Even if "E occurs," we may not have thought about E beforehand.
Thus we will not have previously assessed either P(A and E) or
P(E), and will therefore be unable to make direct use of Bayes'
rule. We will call this the problem of unanticipated knowledge.


This review focuses on two proposed alternatives to Bayes' rule for
revising  probability  assessments  in  the  face  of  new  information:  Richard 
Jeffrey's  rule  of  conditioning  and  Arthur  Dempster's  rule  of  combination. 
Section  2  describes  Jeffrey's  rule.  Section  3  describes  upper  and  lower 
probabilities  and  Dempster's  rule  for  their  combination.  Section  4  shows 
that  the  two  rules  are  in  fact  closely  connected:  Jeffrey's  rule  is  the 
additive  version  of  Dempster's  rule  in  those  situations  where  the  two  rules 
are  comparable. 

Our  presentation  is  intended  as  an  introduction  to  a  growing  and  already 
sizeable literature. It proceeds mainly by a series of examples. For references
to the literature on the limitations of Bayes' rule and for further
information on Jeffrey's rule, see Diaconis and Zabell (1982); for further
references on upper and lower probability and Dempster's rule, see Shafer
(1976, 1982).


2.  JEFFREY'S  RULE  OF  CONDITIONING 

While  the  mathematics  of  Bayes'  rule  presupposes  some  given  event  E, 
Jeffrey's rule assumes the existence of a partition {E_1, E_2, ..., E_n} on
which new probabilities P*(E_i) are given (the elements of a partition are,
by definition, mutually exclusive and exhaustive). It specifies that for
any event A:


JEFFREY'S RULE  P*(A) = Σ_{i=1}^{n} P(A|E_i) P*(E_i) .

Jeffrey's rule is mathematically equivalent to the "J-condition"

(J)  P*(A|E_i) = P(A|E_i)  for all A and i,

and hence is applicable only if this condition is judged to hold. The J-condition can be interpreted as stating that
the  only  impact  of  the  new  evidence  was  to  change  the  probabilities  on  the 
elements  of  the  partition;  given  an  element  of  the  partition,  the  new  and 
old  probabilities  agree. 
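
As an illustration, the following minimal Python sketch applies Jeffrey's rule on a finite space and checks the J-condition numerically. The four-point space, the prior, the partition, and the new weights P*(E_i) are hypothetical values chosen only for the example, and the function names are ours.

```python
from fractions import Fraction as F

def jeffrey_update(prior, partition, new_weights):
    """Return P* with P*(x) = P(x | E_i) * P*(E_i) for each x in E_i."""
    posterior = {}
    for block, w_new in zip(partition, new_weights):
        w_old = sum(prior[x] for x in block)  # P(E_i), assumed positive
        for x in block:
            posterior[x] = prior[x] / w_old * w_new
    return posterior

def prob(p, event):
    """Probability of a set of outcomes under the point masses p."""
    return sum(p[x] for x in event)

# Hypothetical prior on a four-point space and a two-block partition.
prior = {1: F(1, 10), 2: F(2, 10), 3: F(3, 10), 4: F(4, 10)}
partition = [{1, 2}, {3, 4}]
posterior = jeffrey_update(prior, partition, [F(1, 2), F(1, 2)])

# J-condition: conditional probabilities within each block are unchanged.
j_condition_holds = all(
    posterior[x] / prob(posterior, block) == prior[x] / prob(prior, block)
    for block in partition
    for x in block
)
```

Exact fractions are used so that the equality in the J-condition can be tested without rounding error.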

Example  1  (Uncertain  Perception).  Suppose  we  are  about  to  hear  one  of 
two recordings of Shakespeare on the radio, to be read by either Olivier (E)
or Gielgud (E^c), but we are uncertain as to which, and we have a prior with
mass 1/2 on Olivier and 1/2 on Gielgud. After hearing the recording, one
might judge it fairly likely, but by no means certain, to be by Olivier. The
change in belief takes place by direct recognition of the voice. If the only
impact of hearing the recording is to change the odds on Olivier and Gielgud,
in the sense that for any A, P*(A|E) = P(A|E) and P*(A|E^c) = P(A|E^c), then
after assessing P*(E) we may proceed to apply Jeffrey's rule. (Of course,
the former might well not be the case; for example, the quality of the recording
might convey additional information as to its date or manufacture.)

Richard  Jeffrey  has  argued  that  examples  of  this  type  are  the  norm: 

"it  is  rarely  or  never  that  there  is  a  proposition  for  which  the  direct 
effect  of  an  observation  is  to  change  the  observer's  degree  of  belief  in 
a  proposition  to  1"  (Jeffrey  (1962),  p.  171). 

Example  2  (Unanticipated  Knowledge).  Suppose  we  are  thinking  about 
three possible trials of a new surgical procedure. Under the usual circumstances
a probability assignment P is made on the eight possible outcomes
Ω = {000, 001, 010, 100, 011, 101, 110, 111}, where 1 denotes a successful
outcome and 0 an unsuccessful one. Suppose a colleague informs us that another
hospital had performed this type of operation 100 times, with 80 successful outcomes.
This is clearly relevant information and we will obviously want to revise
our opinion. The information cannot be put in terms of the occurrence of an
event in the original eight-point space Ω, and Bayes' rule is not directly
available. 

Diaconis  and  Zabell  (1982)  discuss  four  possible  approaches  to  the 
problem of forming P*: complete reassessment, retrospective conditioning,
the use of exchangeability, and Jeffrey's rule. We review here the use of
Jeffrey's rule, as an example illustrating how natural partitions {E_i}
can  arise. 

Suppose that the original probability P was exchangeable, that is,
P(001) = P(010) = P(100) and P(110) = P(101) = P(011). In the situation described,
the colleague's report says nothing about the order of the trials, and
we may thus require the new P* to remain exchangeable. Consider the partition
{E_0, E_1, E_2, E_3}, where E_i is the set of outcomes with i ones: E_0 =
{000}, E_1 = {001, 010, 100}, E_2 = {110, 101, 011}, E_3 = {111}. The exchangeability
of both P and P* is equivalent to Jeffrey's condition:

P(A|E_i) = P*(A|E_i) ,

and so, to complete the assignment of P*, we need only undertake an assessment
of P*(E_i). Then P* is determined by Jeffrey's rule: for any set A

P*(A) = Σ_{i=0}^{3} P(A|E_i) P*(E_i) .
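
A small Python sketch of Example 2 follows. The uniform prior and the assessed values P*(E_i) are assumptions made for illustration only: the text does not prescribe them. Here P*(E_i) is taken to be binomial with success probability 4/5, loosely suggested by the colleague's 80 successes in 100 operations.

```python
from fractions import Fraction as F
from itertools import product

# The eight outcomes 000, 001, ..., 111 (1 = success).
outcomes = ["".join(bits) for bits in product("01", repeat=3)]

# Hypothetical exchangeable prior: uniform on the eight outcomes.
prior = {w: F(1, 8) for w in outcomes}

# Hypothetical assessment of P*(E_i), i = number of successes.
new_weights = {0: F(1, 125), 1: F(12, 125), 2: F(48, 125), 3: F(64, 125)}

def block_of(w):
    """Index i of the partition element E_i containing outcome w."""
    return w.count("1")

# P(E_i) under the prior.
prior_block = {
    i: sum(p for w, p in prior.items() if block_of(w) == i) for i in range(4)
}

# Jeffrey's rule: P*(w) = P(w | E_i) * P*(E_i) for w in E_i.
posterior = {
    w: prior[w] / prior_block[block_of(w)] * new_weights[block_of(w)]
    for w in outcomes
}

# Probability under P* that the first trial succeeds.
first_success = sum(p for w, p in posterior.items() if w[0] == "1")
```

Because the prior is exchangeable and only the weights on the E_i change, the revised P* is again exchangeable: outcomes with the same number of ones receive equal mass.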

Example 3 (Bayes' Rule). If (1) the partition consists of a set E
and its complement E^c, and (2) P*(E^c) = 0, then Jeffrey's rule reduces
to Bayes' rule P*(A) = P(A|E).

Diaconis  and  Zabell  (1982)  link  the  J-condition  with  the  statistical 
concept of sufficiency, and show that Jeffrey's rule gives the closest probability
P* to P with prescribed values P*(E_i). Our 1982 paper also
gives continuous versions of the rule, and an analysis of Jeffrey's rule
when  two  or  more  sources  of  evidence  are  considered  simultaneously. 

3.  UPPER  AND  LOWER  PROBABILITIES,  AND  DEMPSTER'S  RULE  OF  COMBINATION 
We  begin  our  discussion  with  an  example  drawn  from  Diaconis  (1978). 

It  concerns  the  well  known  problem  of  the  three  prisoners: 

Of three prisoners, a, b, and c, two are to be executed
but a does not know which. He therefore says to the jailer,
'Since either b or c is certainly going to be executed, you
will give me no information about my own chances if you give me
the name of one man, either b or c, who is going to be executed.'
Accepting this argument, the jailer truthfully replies, 'b will be
executed.' Thereupon a feels happier because before the jailer
replied, his own chance of execution was two-thirds, but afterward
there are only two people, himself and c, who could be the
one not executed, and so his chance of execution is one-half.

Is  a  justified  in  believing  that  his  chances  of  escaping  execution 
have  improved?  Consider  the  set  of  possible  outcomes 

S = {(a,b), (a,c), (b,c), (c,b)}

where, for example, (a,b) means a will live and the jailer answers b.

In the classical Bayesian solution of this problem (see, e.g., Gardner (1961),
Chapter 19), a, b, and c are assumed equally likely to be pardoned, and if a
is to be set free, it is assumed that the jailer will answer by choosing b
or c with probability 1/2. These assumptions translate into the probability
P on S with P(a,b) = P(a,c) = 1/6, P(b,c) = P(c,b) = 1/3, and Bayes'
rule gives

P(a lives | jailer says b) = P(a,b) / [P(a,b) + P(c,b)] = 1/3 ,

i.e., a's chances have not improved.
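
The classical calculation can be checked with a few lines of Python. This is only a sketch of the computation just described, with outcomes coded as (who lives, whom the jailer names) and with the equal-pardon and fair-coin assumptions built into the point masses; the variable names are ours.

```python
from fractions import Fraction as F

# Point masses on S under the classical assumptions:
# each prisoner pardoned with probability 1/3, and the jailer
# names b or c with probability 1/2 each when a is pardoned.
P = {
    ("a", "b"): F(1, 6),
    ("a", "c"): F(1, 6),
    ("b", "c"): F(1, 3),
    ("c", "b"): F(1, 3),
}

def prob(event):
    """Probability of a set of outcomes."""
    return sum(P[o] for o in event)

says_b = {o for o in P if o[1] == "b"}   # jailer names b
a_lives = {o for o in P if o[0] == "a"}  # a is pardoned

# Bayes' rule: P(a lives | jailer says b).
posterior = prob(a_lives & says_b) / prob(says_b)
```

The posterior equals the prior probability that a lives, which is the sense in which the jailer's answer is uninformative under these assumptions.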

We will discuss three ways to model this problem using upper and
lower probabilities P^*, P_*. Upper and lower probabilities are functions
defined on the subsets of a set S satisfying

C1  P_*(∅) = 0,  P_*(S) = 1 ,

C2  P^*(A) = 1 - P_*(A^c) ,

and the inequalities

C3  P_*(A_1 ∪ ... ∪ A_n) ≥ Σ_i P_*(A_i) - Σ_{i<j} P_*(A_i ∩ A_j)
        + ... + (-1)^{n+1} P_*(A_1 ∩ ... ∩ A_n) .

Conditions C1, C2, and C3 will be motivated later on. For the present we
note that the definitions imply P_* ≤ P^*, so that the upper-lower pair
(P_*, P^*) may be thought of as bounds on some "true probability" P, with
P_* ≤ P ≤ P^*. A simple example is the vacuous upper-lower pair

P_*(A) = 0 if A ⊂ S (A ≠ S),  P_*(S) = 1 .

The  vacuous  pair  is  often  suggested  as  a  way  of  quantifying  a  state  of  "no 
knowledge." 

Arthur Dempster has suggested that, given the occurrence of an event E,
the appropriate way of modifying an upper-lower pair to a new upper-lower
pair incorporating the new information is via:

DEMPSTER'S RULE  P^*(A|E) = P^*(A ∩ E)/P^*(E) .

A motivation for Dempster's rule will also be given later. First we return
to the three prisoner problem and show how it may be analyzed using
different  upper-lower  pairs  and  Dempster's  rule. 

Model 1. Suppose that prisoner a models his (lack of) knowledge
by putting the vacuous upper-lower pair on the four-point set S. Then
the definitions imply P^*(a will live | jailer says b) = 1, P_*(a will live |
jailer says b) = 0. Thus, with no assumptions on the problem, the jailer's
information does not reduce his uncertainty, and the conditional upper-lower
pair remains vacuous.

Model 2. Suppose that a assumes that the initial decision as to who
will live is made at random, but assumes nothing about how the jailer will
act except that he will tell the truth. One way to model this is to consider
the space L = {a, b, c}; the probability P on L corresponding to the random
choice of who will live, i.e., P(a) = P(b) = P(c) = 1/3; and the multivalued map
Γ, from L to the subsets of S, given by

Γ(a) = {(a,b), (a,c)},  Γ(b) = {(b,c)},  Γ(c) = {(c,b)} .

Thus Γ delineates the possible outcomes when a, b, or c is pardoned.

Dempster has described how an upper-lower pair can be constructed on
S whenever a set L, probability P on L, and multivalued map
Γ: L → subsets of S are given. Define

P^*(A) = P{ℓ ∈ L: Γ(ℓ) ∩ A ≠ ∅} ,  and  P_*(A) = P{ℓ ∈ L: Γ(ℓ) ⊂ A} .

P^* and P_* represent the largest and smallest probabilities that can be
assigned to A consistent with Γ and P.
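
The construction is easy to carry out on a finite space. The sketch below, in Python, computes the upper and lower probabilities induced by the multivalued map of Model 2; the dictionary encoding of the map and the function names are our own choices for illustration.

```python
from fractions import Fraction as F

# Outcomes are (who lives, whom the jailer names).
S = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "b")}

# Probability on L = {a, b, c} and the multivalued map of Model 2.
P = {"a": F(1, 3), "b": F(1, 3), "c": F(1, 3)}
GAMMA = {
    "a": {("a", "b"), ("a", "c")},
    "b": {("b", "c")},
    "c": {("c", "b")},
}

def upper(A):
    """Upper probability: mass of points whose image meets A."""
    return sum(P[l] for l in P if GAMMA[l] & A)

def lower(A):
    """Lower probability: mass of points whose image lies inside A."""
    return sum(P[l] for l in P if GAMMA[l] <= A)

says_b = {("a", "b"), ("c", "b")}
a_lives = {("a", "b"), ("a", "c")}

upper_says_b = upper(says_b)
lower_says_b = lower(says_b)
```

The event "a lives" has equal upper and lower probability, since the image of a is exactly that event, whereas "jailer says b" does not, since the image of a only partly overlaps it.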
The French mathematician Gustave Choquet (1953) proved the following
important result.

Theorem. Every upper-lower pair constructed in this way from a multivalued
map satisfies conditions C1, C2, and C3. Conversely, given an upper-lower
pair satisfying C1, C2, and C3, there exists a set L, a probability
P on L, and a multivalued map Γ: L → subsets of S which realizes the upper-lower
pair.

This is the promised motivation for C1, C2, and C3. Any function P_*
satisfying C1, C2, C3 is said to be a capacity of infinite order. The infinite
system of inequalities C3 is known as the Block-Marschak inequalities in the
psychology of choice; see the article by Batchelder in this volume.

Returning to the three prisoner example, we have for the upper-lower
pair P_*, P^* that arises from L, P, and Γ:

P^*(jailer says b) = P^*{(a,b) ∪ (c,b)} = P{a ∪ c} = 2/3 ,

P_*(jailer says b) = 1 - P^*{jailer does not say b} = 1 - P^*{(a,c) ∪ (b,c)} = 1/3 .

This result is intuitively reasonable: if the jailer said b whenever he truthfully
could, he would say b 2/3 of the time. If the jailer avoided saying b
whenever he truthfully could, he would say b 1/3 of the time. Dempster's
rule of conditioning then gives

P^*(a will live | jailer says b) = P_*(a will live | jailer says b) = 1/2 .

Thus, with this set of assumptions a is justified in reasoning exactly as
described in the original version of the problem. Observe that after Dempster
conditioning the two members of the upper-lower pair are actually equal,
coalescing to a bona fide probability.
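
The conditioning step for Model 2 can be reproduced as follows. In this Python sketch we read Dempster's rule as acting on the upper probability and obtain the conditional lower probability from the complement, as in condition C2; that reading, like the function names, is an assumption of the sketch.

```python
from fractions import Fraction as F

S = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "b")}
P = {"a": F(1, 3), "b": F(1, 3), "c": F(1, 3)}
GAMMA = {
    "a": {("a", "b"), ("a", "c")},
    "b": {("b", "c")},
    "c": {("c", "b")},
}

def upper(A):
    """Upper probability induced by the multivalued map."""
    return sum(P[l] for l in P if GAMMA[l] & A)

def upper_given(A, E):
    """Dempster conditioning applied to the upper probability."""
    return upper(A & E) / upper(E)

def lower_given(A, E):
    """Conditional lower probability via the complement (assumed, as in C2)."""
    return 1 - upper_given(S - A, E)

says_b = {("a", "b"), ("c", "b")}
a_lives = {("a", "b"), ("a", "c")}

cond_upper = upper_given(a_lives, says_b)
cond_lower = lower_given(a_lives, says_b)
```

Both conditional values coincide here, which is the coalescence to a single probability noted above.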

A "lazy Bayesian" could regard the formation of an upper-lower pair
based on a multivalued mapping as a way of proceeding without quantifying
belief within the elements of Γ(ℓ). The calculations result in bounds which
would be useful in checking a more refined quantification.

Here is Dempster's motivation for his rule of conditioning, via multivalued
mappings. Consider a pair of probability spaces and multivalued
mappings:

Γ_1: (L_1, P_1) → 𝒮 ,   Γ_2: (L_2, P_2) → 𝒮 ,

(where 𝒮 denotes the subsets of S). Define a product space (L_1 × L_2, P_1 × P_2)
and Γ_1 × Γ_2: L_1 × L_2 → 𝒮 by

(Γ_1 × Γ_2)(ℓ_1, ℓ_2) = Γ_1(ℓ_1) ∩ Γ_2(ℓ_2) .

It is easy to show that

D1  If Γ_1(ℓ) = S, then the upper-lower pair associated with Γ_1 is vacuous
    and the upper-lower pair associated with the product is identical
    to the upper-lower pair associated with Γ_2.

D2  If either of the component upper-lower pairs is a probability, then the
    product is a probability.

D3  If Γ_1(ℓ) = E, then the product yields Dempster's rule of conditioning.

For  further  discussion  of  this  motivation  for  Dempster’s  rule,  see  Dempster 
(1968). 
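
Properties D1 and D3 can be checked on the three prisoner space with a short Python sketch. One point is an assumption of the sketch rather than something spelled out above: pairs whose images have empty intersection are discarded and the remaining product mass is renormalized, which is the usual normalization in Dempster's rule. The function names and the one-point spaces "v" and "e" are ours.

```python
from fractions import Fraction as F
from itertools import product

S = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "b")}

def combine(P1, G1, P2, G2):
    """Product of two multivalued maps, with images intersected.

    Assumption: pairs with empty intersection are dropped and the
    remaining product mass is renormalized.
    """
    masses, images = {}, {}
    for l1, l2 in product(P1, P2):
        common = G1[l1] & G2[l2]
        if common:
            masses[(l1, l2)] = P1[l1] * P2[l2]
            images[(l1, l2)] = common
    total = sum(masses.values())
    return {k: m / total for k, m in masses.items()}, images

def upper(P, G, A):
    """Upper probability of A induced by (P, G)."""
    return sum(P[l] for l in P if G[l] & A)

# Model 2 as the second component.
P2 = {"a": F(1, 3), "b": F(1, 3), "c": F(1, 3)}
G2 = {"a": {("a", "b"), ("a", "c")}, "b": {("b", "c")}, "c": {("c", "b")}}

says_b = {("a", "b"), ("c", "b")}
a_lives = {("a", "b"), ("a", "c")}

# D1: a one-point space mapped to all of S changes nothing.
P_v, G_v = combine({"v": F(1)}, {"v": S}, P2, G2)

# D3: a one-point space mapped to E = "jailer says b" conditions on E.
P_e, G_e = combine({"e": F(1)}, {"e": says_b}, P2, G2)
d3_upper = upper(P_e, G_e, a_lives)
```

With the normalization assumed here, the D3 value agrees with P^*(A ∩ E)/P^*(E) computed directly from Model 2.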

To us, the multivalued mapping approach to upper-lower pairs seems preferable
to their direct use and interpretation (as favored by Shafer (1976)).

To discuss Model 2 and the example further, consider the general Bayesian
solution to the three prisoner problem: let π_a, π_b, and π_c be the prior
probabilities that a, b, and c are pardoned, and let p be the probability
that the jailer names b when a is pardoned. Then

P(a,b) = π_a p ,  P(a,c) = π_a (1-p) ,

P(b,c) = π_b ,  P(c,b) = π_c ,

and

P(a lives | jailer says b) = π_a p / (π_a p + π_c) .

For Model 2 we have π_a = π_b = π_c = 1/3, with p remaining a free parameter.
This generates a family 𝒫 of possible probability measures. This
family can be used to define a different kind of upper-lower probability,
say U and L, defined by

U(A) = max{P(A): P ∈ 𝒫}  and  L(A) = min{P(A): P ∈ 𝒫} .

In this case, it is easy to check that U and L are exactly the same as
those derived via the multivalued map Γ. In general, however, upper-lower
pairs defined by sups and infs will not be capacities of infinite order,
but merely capacities of order 2. Walley and Fine (1979) contains further
discussion.
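
The agreement between the two constructions for Model 2 can be verified by enumerating all sixteen events. The Python sketch below does this; it uses the fact that each event's probability is linear in the jailer's parameter p, so the maximum and minimum over the family are attained at p = 0 or p = 1. The names are ours.

```python
from fractions import Fraction as F
from itertools import combinations

S = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "b")]
THIRD = F(1, 3)

def bayes_model(p):
    """Point masses on S when a, b, c are equally likely to be pardoned
    and the jailer names b with probability p when a is pardoned."""
    return {
        ("a", "b"): THIRD * p,
        ("a", "c"): THIRD * (1 - p),
        ("b", "c"): THIRD,
        ("c", "b"): THIRD,
    }

P = {"a": THIRD, "b": THIRD, "c": THIRD}
GAMMA = {"a": {("a", "b"), ("a", "c")}, "b": {("b", "c")}, "c": {("c", "b")}}

def upper(A):
    return sum(P[l] for l in P if GAMMA[l] & A)

def lower(A):
    return sum(P[l] for l in P if GAMMA[l] <= A)

def events():
    """All subsets of S."""
    for r in range(len(S) + 1):
        for combo in combinations(S, r):
            yield set(combo)

# P_p(A) is linear in p, so only the endpoints p = 0, 1 need checking.
ENDPOINTS = (F(0), F(1))

def env_upper(A):
    return max(sum(bayes_model(p)[o] for o in A) for p in ENDPOINTS)

def env_lower(A):
    return min(sum(bayes_model(p)[o] for o in A) for p in ENDPOINTS)

agree = all(
    env_upper(A) == upper(A) and env_lower(A) == lower(A) for A in events()
)
```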

Note here that the conditional probabilities generated by 𝒫 range
from 0 to 1/2, while Dempster's rule of conditioning picks out the unique
value 1/2. This is a disturbing result for a Bayesian, since it calls into
question both the interpretation and justification of Dempster's rule.
Either Dempster's rule contains further hidden, implicit assumptions, here
responsible for narrowing down the range of possible conditional probabilities
to but one, or it operates in a manner very different from ordinary Bayesian
conditioning, in which case we would wish some further guidance as to its
interpretation and meaning. Mere surface plausibility is insufficient, for
it is possible to suggest at least one equally plausible alternative to
Dempster's rule, namely

P_*(A|B) = P_*(A and B)/P_*(B)  and  P^*(A|B) = 1 - P_*(A^c|B) .

This yields a rule of conditioning different from Dempster's, yet the resulting
conditional set functions are capacities. In what sense is one of them right?
(Note that for this method of conditioning the upper-lower pair for Model 2
of the three prisoner problem yields upper and lower conditional probabilities
of 0.)
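
The three answers contrasted in this passage can be computed side by side. The Python sketch below evaluates the Bayesian conditional probability over a grid of values of the jailer's parameter p (the grid is our choice), then applies Dempster's rule and the alternative rule to the Model 2 pair. The alternative rule is coded as conditioning the lower probability directly, with the upper obtained from the complement.

```python
from fractions import Fraction as F

THIRD = F(1, 3)

def bayes_conditional(p):
    """P(a lives | jailer says b) when each prisoner is pardoned with
    probability 1/3 and the jailer names b with probability p."""
    return THIRD * p / (THIRD * p + THIRD)

# Hypothetical grid of values for the free parameter p.
grid = [F(k, 100) for k in range(101)]
bayes_values = [bayes_conditional(p) for p in grid]

S = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "b")}
P = {"a": THIRD, "b": THIRD, "c": THIRD}
GAMMA = {"a": {("a", "b"), ("a", "c")}, "b": {("b", "c")}, "c": {("c", "b")}}

def upper(A):
    return sum(P[l] for l in P if GAMMA[l] & A)

def lower(A):
    return sum(P[l] for l in P if GAMMA[l] <= A)

A = {("a", "b"), ("a", "c")}   # a lives
B = {("a", "b"), ("c", "b")}   # jailer says b

# Dempster's rule: condition the upper probability.
dempster_upper = upper(A & B) / upper(B)

# Alternative rule: condition the lower probability, upper by complement.
alt_lower = lower(A & B) / lower(B)
alt_upper = 1 - lower((S - A) & B) / lower(B)
```

The Bayesian values fill the interval from 0 to 1/2, Dempster's rule returns its right endpoint, and the alternative rule returns its left endpoint.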

Model 3. Suppose a knows nothing about the selection process for who
will live, but assumes (or is told) that if he lives, the jailer will choose
randomly between answering b or c. (Of course, if the jailer knows b is
to live, he will answer c, and vice versa.) This problem can be modeled by
assuming that three different probability measures are given on the set
W = {b, c} of the jailer's possible answers: P_a(b) = P_a(c) = 1/2; P_b(c) = 1;
P_c(b) = 1. Given the jailer's answer, Chapter 11 of Shafer (1976) proposes
a method related to direct use of likelihood for deriving an upper-lower pair
on the parameter set L = {a, b, c}. This yields P_*(a will live | jailer says b)
= 0, P^*(a will live | jailer says b) = 1/2. In this model, before questioning the
jailer, a might have expressed his ignorance by P_*(a lives) = 0,
P^*(a lives) = 1. After learning b will die, a can no longer be so
optimistic.

Again, the comparison with the Bayesian analysis is instructive. Now
π_a, π_b, π_c are arbitrary and p = 1/2, so that the resulting conditional
probabilities for P(a will live | jailer says b) range from 0 to 1.
Thus while Shafer's method does not suffer in this case from the defect of
picking  out  a  unique  conditional  probability,  the  range  spanned  by  his 
resulting  upper-lower  pair  differs  markedly  from  that  arising  from  the 
Bayesian  analysis,  again  calling  into  question  both  the  interpretation  and 
justification  for  the  method. 
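
The Bayesian range for Model 3 can be seen numerically. The Python sketch below fixes p = 1/2 and sweeps the prior (π_a, π_b, π_c) over a grid on the simplex; the grid resolution is an arbitrary choice of ours, and priors under which the jailer never says b are skipped because the conditional probability is then undefined.

```python
from fractions import Fraction as F

def conditional(pi_a, pi_c, p=F(1, 2)):
    """P(a lives | jailer says b) = pi_a p / (pi_a p + pi_c),
    or None when the jailer says b with probability zero."""
    denom = pi_a * p + pi_c
    return None if denom == 0 else pi_a * p / denom

n = 20  # hypothetical grid resolution
values = []
for i in range(n + 1):
    for j in range(n + 1 - i):
        pi_a, pi_c = F(i, n), F(j, n)  # pi_b = 1 - pi_a - pi_c
        v = conditional(pi_a, pi_c)
        if v is not None:
            values.append(v)

bayes_min, bayes_max = min(values), max(values)
```

The extremes are reached at the boundary of the simplex: the value 0 when π_a = 0 and the value 1 when π_c = 0.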

Dempster  (1966)  has  proposed  a  different  approach  to  this  problem.  In 
general  the  two  methods  do  not  agree,  but  in  this  simple  example  they  do,  and 
hence  the  objection  just  voiced  to  Shafer's  analysis  applies  with  equal  force 
to  Dempster's. 

4. RELATIONSHIPS BETWEEN JEFFREY'S RULE AND DEMPSTER'S RULE

Glenn Shafer has observed that Jeffrey's rule and Dempster's rule agree
in certain cases. This is an easy consequence of the three properties D1-D3
of Dempster's rule given at the end of the preceding section. To be precise,
let P_1 be a probability on a set S, let {E_1, ..., E_n} be a partition of S,
and suppose that P_2(E_i) are positive numbers summing to 1. Define multivalued
mappings Γ_i from L_i → subsets of S as follows:

L_1 = S,  Γ_1(s) = {s} ,

L_2 = {1, ..., n},  Γ_2(i) = E_i .

The product of (P_1, L_1, Γ_1) and (P_2, L_2, Γ_2) gives a probability on
S because of property D2. Shafer (1981, Section 7) shows that this is precisely
the probability given by Jeffrey's rule.
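
A numerical sketch of this correspondence, in Python, is given below. Two points are assumptions of the sketch: the product is normalized by discarding pairs with empty intersection, and the numbers P_2(E_i) are read as weights of evidence. Under that normalization the block probabilities of the combined measure are proportional to P_1(E_i) P_2(E_i), so the comparison with Jeffrey's rule is made at the block probabilities the product actually produces. The prior, partition, and weights are hypothetical.

```python
from fractions import Fraction as F

def dempster_with_partition(P1, partition, weights):
    """Combine P1 (map s -> {s}) with weights on blocks (map i -> E_i).

    A pair (s, i) has nonempty intersection exactly when s lies in E_i,
    so its mass is P1(s) * weight_i; the result is then renormalized.
    """
    masses = {
        s: P1[s] * w for block, w in zip(partition, weights) for s in block
    }
    total = sum(masses.values())
    return {s: m / total for s, m in masses.items()}

def jeffrey_update(P1, partition, new_block_probs):
    """Jeffrey's rule: P*(s) = P1(s | E_i) * P*(E_i) for s in E_i."""
    posterior = {}
    for block, q in zip(partition, new_block_probs):
        old = sum(P1[s] for s in block)
        for s in block:
            posterior[s] = P1[s] / old * q
    return posterior

# Hypothetical prior, partition, and evidence weights.
P1 = {1: F(1, 10), 2: F(2, 10), 3: F(3, 10), 4: F(4, 10)}
partition = [{1, 2}, {3, 4}]
weights = [F(3, 4), F(1, 4)]

combined = dempster_with_partition(P1, partition, weights)
block_probs = [sum(combined[s] for s in block) for block in partition]
same = jeffrey_update(P1, partition, block_probs) == combined
```

The combined measure satisfies the J-condition, since within each block every point mass is multiplied by the same factor; this is why it coincides with a Jeffrey update.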

Thus  Dempster's  rule  may  be  viewed  as  a  generalization  of  Jeffrey's 
rule.  The  difference  between  them  may  be  summarized  as  follows: 

1.  Jeffrey's rule works with ordinary probabilities, which have a
well understood interpretation in a variety of real world situations.
Dempster's rule works with upper and lower probabilities, which presently
lack an operational interpretation, objective or subjective.

2.  Dempster's rule is a way to pool fairly general types of information.
If one is willing to work outside the world of well defined probabilities,
upper-lower pairs representing information from very general sources can be
combined. An additive approach to the combination of different types of
evidence is given in Sections 3, 4, 5 of Diaconis and Zabell (1982). The
comparison of the two approaches is instructive: Dempster's rule is based
on an intuitive notion of independence; the method using Jeffrey's rule that
we suggest is not tied to such independence.

Finally, it is worth considering a problem that neither theory claims
to know how to treat. Suppose we have a probability P defined on a class
ℱ of subsets of a space S. After observation or reflection we decide
that we need to work with a richer collection of sets ℱ*, perhaps even a
larger basic space S*. For example, new data may force us to consider outcomes
previously thought impossible or unimportant. How should we proceed
to extend P, changing it as little as possible? Several procedures are
available under special circumstances, but any semblance of a general theory
is presently lacking.